Abstract

We measure the top quark mass $m_t$ using $t\bar{t}$ pairs produced in the D0 detector
by $\sqrt{s} = 1.8$ TeV $p\bar{p}$ collisions in a 125 pb$^{-1}$ exposure at the Fermilab
Tevatron. We make a two constraint fit to $m_t$ in $t\bar{t} \to bW^+\,\bar{b}W^-$ final states with
one $W$ decaying to $q\bar{q}$ and the other to $e\nu$ or $\mu\nu$. Events are binned in fit mass
versus a measure of probability for events to be signal rather than background.
Likelihood fits to the data yield $m_t = 173.3 \pm 5.6$ (stat) $\pm 6.2$ (syst) GeV/$c^2$.

The top quark has a large mass $m_t$ that can be determined to greater fractional precision
than is possible for the lighter quarks, which decay after they form hadrons. Since $m_t$ is large,
it controls the strength of quark-loop corrections to tree-level relations among electroweak
parameters. If these parameters and $m_t$ are measured precisely, the Standard Model Higgs
boson mass can be constrained.

Direct measurements of $m_t$ have been published as part of the initial observations of
$t\bar{t}$ production in $\sqrt{s} = 1.8$ TeV $p\bar{p}$ collisions. At present, the best accuracy in $m_t$ is achieved
for lepton + jets ($\ell$+jets) final states in which one $W$ boson (from $t \to bW$) decays to $e\nu$
or $\mu\nu$ and the other $W$ decays to a $q\bar{q}$ pair that forms jets. We report a measurement of $m_t$
in the $\ell$+jets channel using the $\approx$125 pb$^{-1}$ exposure of the D0 detector during the 1992-96
Fermilab Tevatron runs. Since Ref. appeared, our data sample has doubled, and for a
fixed sample size our error on $m_t$ has halved.

The D0 detector and our basic methods for triggering, reconstructing events, and identifying
particles are described elsewhere. Recent advances include enhanced triggering and
reconstruction efficiency for $\mu$+jets events, due in part to better use of calorimeter data. As
a signature of $W \to \ell\nu$, we require missing energy transverse to the beam ($\not\!\!E_T$) > 20 GeV,
and one isolated $e$ or $\mu$ ($\ell$) with $E_T > 20$ GeV and pseudorapidity $|\eta_e| < 2$ or $|\eta_\mu| < 1.7$.
We also demand $\not\!\!E_T^{\rm cal} > 25$ (20) GeV for $e$+jets ($\mu$+jets) events, where $\not\!\!E_T^{\rm cal}$ is $\not\!\!E_T$ measured
only in the calorimeter. As signatures of the $q\bar{q}$ from $W$ decay and the $b$ and $\bar{b}$ from $t$ and $\bar{t}$
decay, we require $\geq 4$ jets reconstructed with cones of half-angle $\Delta\mathcal{R} = (\Delta\phi^2 + \Delta\eta^2)^{1/2} = 0.5$,
having $E_T > 15$ GeV and $|\eta| < 2$.
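As an illustration, the kinematic requirements listed above can be collected into a single predicate. This is a minimal sketch and not the collaboration's code: the `Jet` and `Event` containers and their field names are hypothetical, and lepton isolation and triggering are assumed to have been applied upstream.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Jet:
    et: float   # transverse energy in GeV
    eta: float  # pseudorapidity


@dataclass
class Event:
    lepton_flavor: str     # "e" or "mu" (hypothetical labels)
    lepton_et: float       # GeV
    lepton_eta: float
    missing_et: float      # GeV, full missing-E_T measurement
    missing_et_cal: float  # GeV, calorimeter-only missing E_T
    jets: List[Jet]


def passes_preselection(ev: Event) -> bool:
    """Apply the lepton, missing-E_T and jet cuts quoted in the text."""
    is_electron = ev.lepton_flavor == "e"
    eta_max = 2.0 if is_electron else 1.7    # |eta_e| < 2, |eta_mu| < 1.7
    cal_min = 25.0 if is_electron else 20.0  # calorimeter-only missing E_T cut
    good_jets = [j for j in ev.jets if j.et > 15.0 and abs(j.eta) < 2.0]
    return (
        ev.lepton_et > 20.0
        and abs(ev.lepton_eta) < eta_max
        and ev.missing_et > 20.0
        and ev.missing_et_cal > cal_min
        and len(good_jets) >= 4
    )
```

Keeping the flavor-dependent thresholds in one place makes the difference between the $e$+jets and $\mu$+jets selections explicit.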

Within $\Delta\mathcal{R} = 0.5$ of a jet axis, additional muons ($\mu$ tags) satisfying $p_T > 4$ GeV/$c$ and
$|\eta| < 1.7$ arise mainly from $b$ and $c$ quark semileptonic decay. These occur in $\approx$20% of $t\bar{t}$
events but only $\approx$2% of background events. In untagged events, to suppress background
we require $E_T^L$ ($= |E_T^\ell| + |\not\!\!E_T|$) > 60 GeV and $|\eta_W| < 2$ for the $W \to \ell\nu$. The latter cut,
exhibited in Fig. 1(a), reduces the difference in $\eta_W$ distributions between data and Monte
Carlo (MC) simulated background. We use the HERWIG MC to simulate top signal,
and the VECBOS MC (with HERWIG fragmentation of partons into jets) to simulate (but
not to normalize) the dominant $W$+multijet background. The ~20% of background events
from non-$W$ sources are modeled by multijet data that barely fail the lepton identification
criteria.

To each event passing the above cuts, we make a two constraint (2C) kinematic fit
to the $t\bar{t} \to \ell$+jets hypothesis by minimizing a $\chi^2 = (v - v^*)^T G (v - v^*)$, where $v$ ($v^*$) is
the vector of measured (fit) variables and $G^{-1}$ is its error matrix. Both reconstructed $W$
masses are constrained to equal the $W$ pole mass, and the same fit mass $m_{\rm fit}$ is assigned
to both the $t$ and $\bar{t}$ quarks. If the event contains >4 accepted jets, only the four jets with
highest $E_T$ are used. In $\approx$50% of MC top events, these jets correspond to the $b$, $\bar{b}$, $q$, and $\bar{q}$.
With (without) a $\mu$ tag in the event, there are 6 (12) possible fit assignments of these jets
to the quarks, each having two solutions to the $\nu$ longitudinal momentum $p_z^\nu$. We use
$m_{\rm fit}$ only from the permutation with lowest $\chi^2$, the correct choice for $\approx$20% of MC top events.
Because of the ambiguities, $m_{\rm fit}$ is not the same as $m_t$, though they are strongly correlated.
Our best estimate of $m_t$ is obtained from the best match between MC samples and the data.
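The counting of 12 (6) jet assignments and the twofold ambiguity in the neutrino longitudinal momentum can be checked with a short sketch. It rests on assumptions not spelled out in the text: the two jets from the hadronic $W$ are treated as interchangeable, a $\mu$-tagged jet is required to be one of the two $b$ jets, the two $p_z^\nu$ values are taken as the roots of the quadratic that follows from imposing the $W$ mass on a massless lepton and neutrino, and $M_W = 80.4$ GeV/$c^2$ is an assumed input. The function names are hypothetical.

```python
import math
from itertools import permutations


def jet_assignments(tagged=None):
    """Distinct assignments of four jets to (b_lep, b_had, {q, qbar}).

    `tagged` is the index of a mu-tagged jet, if any; it must then be a b jet.
    """
    seen = set()
    for b_lep, b_had, q1, q2 in permutations(range(4), 4):
        if tagged is not None and tagged not in (b_lep, b_had):
            continue
        # Swapping the two W-decay jets gives the same hypothesis.
        seen.add((b_lep, b_had, frozenset((q1, q2))))
    return sorted(seen, key=lambda a: (a[0], a[1]))


def neutrino_pz(lepton_p, nu_px, nu_py, m_w=80.4):
    """Two roots for the neutrino p_z from the W-mass constraint.

    lepton_p is (px, py, pz) in GeV/c; lepton and neutrino are taken massless.
    If the discriminant is negative it is clamped to zero (a choice made here).
    """
    lx, ly, lz = lepton_p
    e_l = math.sqrt(lx * lx + ly * ly + lz * lz)
    pt_l2 = lx * lx + ly * ly
    pt_nu2 = nu_px * nu_px + nu_py * nu_py
    a = 0.5 * m_w * m_w + lx * nu_px + ly * nu_py
    root = math.sqrt(max(a * a - pt_l2 * pt_nu2, 0.0))
    return ((a * lz - e_l * root) / pt_l2, (a * lz + e_l * root) / pt_l2)


if __name__ == "__main__":
    print(len(jet_assignments()), len(jet_assignments(tagged=0)))
    print(neutrino_pz((30.0, 10.0, 20.0), -25.0, 15.0))
```

In a full analysis each assignment and each $p_z^\nu$ root would seed a separate constrained fit, and the combination with the lowest $\chi^2$ would be kept.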

From the 90-event distribution shown in Fig. 1(b) we select 77 events with a 2C fit
satisfying $\chi^2 < 10$. Of these, 5 are $\mu$ tagged and ~65% are background. Further separation
of signal and background events is based on four kinematic variables $x = \{x_1, x_2, x_3, x_4\}$
chosen to have small correlation with $m_{\rm fit}$. On average, all are larger for MC top events than
for background events, selected to have the same $\langle m_{\rm fit} \rangle$ as the top events. The simpler
variables are $x_1 = \not\!\!E_T$ and $x_2 = A$, where aplanarity $A$ is $\frac{3}{2} \times$ the least eigenvalue of the
normalized laboratory momentum tensor of the jets and the $W$ boson. The third variable
$x_3 = H_{T2}/H_z$ measures the event's centrality, where $H_z$ is the sum of $|p_z|$ of $\ell$, $\nu$, and the
jets, and $H_{T2}$ is the sum of all jet $|E_T|$ except the highest. Finally, $x_4 = \Delta\mathcal{R}_{jj}^{\min} E_T^{\min}/E_T^L$
measures the extent to which jets are clustered together, where $\Delta\mathcal{R}_{jj}^{\min}$ is the minimum $\Delta\mathcal{R}$
of the six pairs of four jets, and $E_T^{\min}$ is the smaller jet $E_T$ from the minimum $\Delta\mathcal{R}$ pair. As
shown for the background dominated jet sample in Fig. 1(c-f), $x_1$-$x_4$ are reasonably
well modeled by MC; this is true also for the $W$+2 jet and top mass samples (not shown).

[Figure 1: recovered axis labels include $x_1 = \not\!\!E_T$ (GeV) and $x_2 = A$.]

FIG. 1. Events per bin vs. event selection variables defined in the text, plotted for (a-b, g-h)
top quark mass analysis samples, and (c-f) jet control samples. Histograms are data, filled
circles are expected top + background mixture, and open triangles are expected background only.
Solid arrows in (a-b) show cuts applied to all events; the open arrow in (g) illustrates the LB cut.
The nonuniform bin widths in (g-h) are chosen to yield uniform bin populations.
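The two less familiar variables, $x_3$ and $x_4$, can be computed directly from jet kinematics. The following is a minimal sketch under stated assumptions: jets are plain dictionaries with hypothetical keys `et`, `eta`, `phi` and `pz`, azimuthal angles lie within one $2\pi$ range, and the denominator of $x_4$ is taken to be the summed lepton and missing transverse energy.

```python
import math
from itertools import combinations


def delta_r(j1, j2):
    """Cone distance sqrt(dphi^2 + deta^2), with dphi wrapped into [0, pi]."""
    dphi = abs(j1["phi"] - j2["phi"])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return math.hypot(dphi, j1["eta"] - j2["eta"])


def h_t2(jets):
    """Sum of jet |E_T| excluding the highest-E_T jet."""
    ets = sorted(abs(j["et"]) for j in jets)
    return sum(ets[:-1])


def centrality_x3(jets, lepton_pz, neutrino_pz):
    """x3 = H_T2 / H_z, with H_z summing |p_z| of lepton, neutrino and jets."""
    h_z = abs(lepton_pz) + abs(neutrino_pz) + sum(abs(j["pz"]) for j in jets)
    return h_t2(jets) / h_z


def clustering_x4(jets, lepton_et, missing_et):
    """x4 = dR_min * E_T_min / (lepton E_T + missing E_T), four leading jets."""
    four = sorted(jets, key=lambda j: j["et"], reverse=True)[:4]
    pair = min(combinations(four, 2), key=lambda p: delta_r(p[0], p[1]))
    et_min = min(pair[0]["et"], pair[1]["et"])
    return delta_r(pair[0], pair[1]) * et_min / (abs(lepton_et) + abs(missing_et))
```

Aplanarity is omitted here because it needs the eigenvalues of a 3x3 momentum tensor, which is more conveniently done with a linear algebra library.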

We bin events in a two-dimensional array with abscissa $m_{\rm fit}$ and ordinate $\mathcal{D}(x)$, where
$\mathcal{D}$ is a multivariate discriminant. To show that our results are robust, we use two methods
for which the definition of $\mathcal{D}$, the granularity with which it is binned, and the additional
requirements are different. In our "low bias" (LB) method, we first parametrize $\mathcal{L}_i(x_i) =
s_i(x_i)/b_i(x_i)$, where $s_i$ and $b_i$ are the top signal and background densities in each variable,
integrating over the others. We form the log likelihood $\ln\mathcal{L} = \sum_i \omega_i \ln\mathcal{L}_i$, where the weights
$\omega_i$ are adjusted slightly away from unity to nullify the average correlation ("bias") of $\mathcal{L}$ with
$m_{\rm fit}$, and for each event we set $\mathcal{D}_{\rm LB} = \mathcal{L}/(1 + \mathcal{L})$. Finally, we divide the ordinate coarsely
into signal- and background-rich bins according to whether the LB cut is passed. This cut is
satisfied if a $\mu$ tag exists; otherwise it is not satisfied if $\mathcal{D}_{\rm LB} < 0.43$ (Fig. 1(g)) or if $H_{T2} < 90$
GeV.
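The LB discriminant and cut described above reduce to a few lines of arithmetic. This sketch assumes the per-variable ratios $s_i(x_i)/b_i(x_i)$ have already been evaluated from parametrized densities (not shown); the function names are hypothetical and the weights default to unity.

```python
import math


def d_lb(ratios, weights=None):
    """D_LB = L / (1 + L), where ln L = sum_i w_i * ln(s_i/b_i).

    `ratios` holds the four per-variable signal-to-background density ratios.
    """
    if weights is None:
        weights = [1.0] * len(ratios)
    log_l = sum(w * math.log(r) for w, r in zip(weights, ratios))
    # Algebraically identical to L / (1 + L) with L = exp(log_l).
    return 1.0 / (1.0 + math.exp(-log_l))


def passes_lb_cut(d, h_t2, mu_tagged):
    """Mu-tagged events pass; others need D_LB >= 0.43 and H_T2 >= 90 GeV."""
    if mu_tagged:
        return True
    return d >= 0.43 and h_t2 >= 90.0
```

An event whose four ratios are all unity sits at $\mathcal{D}_{\rm LB} = 0.5$, the point where signal and background are equally likely under this model.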

Our neural network (NN) method is sensitive to the correlations among the $x_i$ as well as
to their individual densities. We use a three layer feed-forward NN with 4 input nodes fed by
$x$, 5 hidden nodes, and 1 output node, trained on samples of top signal (background) with
density $s(x)$ ($b(x)$). For a given event, the network output $\mathcal{D}_{\rm NN}$ approximates the ratio
$s(x)/(s(x) + b(x))$. We divide the ordinate finely into ten bins in $\mathcal{D}_{\rm NN}$, independent of $H_{T2}$ or
$\mu$ tagging. Figure 1(g-h) shows that $\mathcal{D}_{\rm LB}$ and $\mathcal{D}_{\rm NN}$ are distributed as predicted and provide
comparable discrimination, as we expect when the $\omega_i$ are close to unity and the $\mathcal{L}_i$ are not
strongly correlated. Figure 2 exhibits the arrays for the NN method. Little correlation
between $\mathcal{D}_{\rm NN}$ and $m_{\rm fit}$ is evident in the expected signal or background distributions, which
are distinct; the data clearly reveal contributions from both sources. Figure 3 shows the
distributions of $m_{\rm fit}$ for data (a) passing and (b) failing the LB cut.

FIG. 2. Events per bin ($\propto$ areas of boxes) vs. $\mathcal{D}_{\rm NN}$ (ordinate) and $m_{\rm fit}$ (abscissa) for (a)
expected 172 GeV/$c^2$ top signal, (b) expected background, and (c) data. $\mathcal{D}_{\rm NN}$ is binned as in
Fig. 1(h).
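For orientation, the forward pass of a 4-5-1 feed-forward network of the kind described in the text is sketched below. The sigmoid activation, the weight layout and the bin edges are assumptions made for illustration; the text does not specify them, and no training is shown.

```python
import math
from bisect import bisect_right


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def nn_output(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass: 4 inputs -> 5 hidden nodes -> 1 output in (0, 1).

    w_hidden is a 5x4 list of lists, b_hidden and w_out have length 5.
    """
    hidden = [
        sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def bin_index(d, inner_edges):
    """Index of the discriminant bin, given sorted interior bin edges.

    Nine interior edges define ten bins; edges may be nonuniform so that
    the bins have roughly equal populations.
    """
    return bisect_right(inner_edges, d)
```

With a sigmoid output trained on equal-weight signal and background targets of 1 and 0, the output of such a network tends toward $s(x)/(s(x) + b(x))$, which is the property the text relies on.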




[Figure 3: the abscissa of panels (a-b) is the fit top quark mass (GeV/$c^2$), spanning 80 to 280.]

FIG. 3. (a-b) Events per bin vs. $m_{\rm fit}$ for events (a) passing or (b) failing the LB cut.
Histograms are data, filled circles are the predicted mixture of top and background, and open triangles
are predicted background only. The circles and triangles are the average of the LB and NN fit
predictions, which differ by <10%. (c) Log of arbitrarily normalized likelihood $L$ vs. true top quark
mass $m_t$ for the LB (filled triangles) and NN (open squares) fits, with errors due to finite top MC
statistics. The curves are quadratic fits to the lowest point and its 8 nearest neighbors. In MC
studies, 7% (27%) of simulated experiments yield a smaller LB (NN) maximum likelihood.

TABLE I. Results of fits to data and MC events. Fits to data yield values and errors $\sigma$(stat) for
$m_t$, $n_s$, and $n_b$ (described in the text). Systematic errors are combined in quadrature. The resulting
$m_t$ and its statistical error $\sigma_m$ are the combined LB and NN values. Fits to MC use ensembles of
10,000 simulated experiments composed of top + background, with $m_t$, $\langle n_s \rangle$, and $\langle n_b \rangle$ as listed.
They yield a mean result $\langle m_t \rangle$, a mean statistical error $\langle \sigma_m \rangle$, and a range $\pm\delta m$ within which 68%
of the results fall. Using the LB (NN) method, 6% (25%) of the simulated experiments produce
a $\sigma_m$ which is smaller than we obtain. For an "accurate subset" of the MC ensembles with mean
$\sigma_m/m_t$ that matches our value, $\delta m$ is smaller.



Fits to data
                             LB fit                    NN fit
Quantity fit                 value, $\sigma$(stat)          value, $\sigma$(stat)
$m_t$ (GeV/$c^2$)               $174.0 \pm 5.6$             $171.3 \pm 6.0$
$n_s$                        $23.8^{+8.3}_{-7.8}$        $28.8^{+8.4}_{-9.1}$
$n_b$                        $53.2^{+10.7}_{-9.3}$       $48.2^{+11.4}_{-8.7}$

Systematic error on $m_t$ (GeV/$c^2$)
energy scale                 $\pm 4.0$
generator                    $\pm 4.1$
other                        $\pm 2.2$

Resulting $m_t$ (GeV/$c^2$)     $173.3 \pm 5.6$ (stat) $\pm 6.2$ (syst)

Fits to MC (top + background)
                  type      input                        output
                  of fit    $m_t$   $\langle n_s \rangle$   $\langle n_b \rangle$     $\langle \sigma_m \rangle$   $\langle m_t \rangle$   $\delta m$
full ensemble     LB        175     24      53        9.9     175.0    8.7
                  NN        172     29      48        8.5     171.6    8.0
accurate subset   LB        175     24      53        5.5     175.3    4.6
                  NN        172     29      48        5.8     172.0    6.0

To each $m_t$ for which we have generated MC, we assign a likelihood $L$ which assumes
that all samples obey Poisson statistics. Bayesian integration over possible true signal
and background populations in each bin yields

$$L(m_t, n_s, n_b) = \prod_{i=1}^{M} \sum_{j=0}^{n_i} \binom{n_{si}+j}{j} p_s^{\,j} (1+p_s)^{-n_{si}-j-1} \binom{n_{bi}+k}{k} p_b^{\,k} (1+p_b)^{-n_{bi}-k-1},$$

where $n_s$ ($n_b$) is the expected number of signal (background) events in the data; $n_i$, $n_{si}$, and
$n_{bi}$ are the actual number of data, MC signal, and MC background events in bin $i$; $k = n_i - j$;
$p_{s,b} = n_{s,b}/(M + \sum_i n_{si,bi})$; and $M = 40$ (200) bins for the LB (NN) methods. Maximizing $L$
for each $m_t$ gives the best estimates $n_s^*(m_t)$ and $n_b^*(m_t)$ for $n_s$ and $n_b$. Figure 3(c) displays
$\ln L(m_t, n_s^*(m_t), n_b^*(m_t))$ vs. $m_t$, where the curves determine the best fit $m_t$ and its statistical
error $\sigma_m$.
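A binned likelihood of this type can be evaluated numerically as sketched below. The sketch assumes that each bin contributes a sum over the signal count $j$ of products of terms of the form $\binom{n+j}{j} p^{j} (1+p)^{-(n+j+1)}$, which is what integrating a Poisson mean over a flat prior gives when the MC count in the bin is $n$. The function names are hypothetical, and $n_s$ and $n_b$ are assumed to be strictly positive.

```python
import math


def log_term(n_mc, j, p):
    """ln[ C(n_mc + j, j) * p**j * (1 + p)**-(n_mc + j + 1) ]."""
    log_binom = (
        math.lgamma(n_mc + j + 1) - math.lgamma(n_mc + 1) - math.lgamma(j + 1)
    )
    log_pj = j * math.log(p) if j > 0 else 0.0
    return log_binom + log_pj - (n_mc + j + 1) * math.log1p(p)


def log_likelihood(n_data, n_sig_mc, n_bkg_mc, n_s, n_b):
    """ln L summed over bins; the three lists hold per-bin integer counts."""
    m_bins = len(n_data)
    p_s = n_s / (m_bins + sum(n_sig_mc))
    p_b = n_b / (m_bins + sum(n_bkg_mc))
    total = 0.0
    for n_i, n_si, n_bi in zip(n_data, n_sig_mc, n_bkg_mc):
        # j data events attributed to signal, k = n_i - j to background.
        terms = [
            log_term(n_si, j, p_s) + log_term(n_bi, n_i - j, p_b)
            for j in range(n_i + 1)
        ]
        top = max(terms)  # log-sum-exp for numerical stability
        total += top + math.log(sum(math.exp(t - top) for t in terms))
    return total
```

Scanning `log_likelihood` over $n_s$ and $n_b$ at each generated $m_t$, and then fitting a parabola to the resulting maxima, would mirror the procedure described in the text.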

Table I presents the fit results, which are consistent with Ref. and with recent
reports. The LB and NN results $m_t^{\rm LB}$ and $m_t^{\rm NN}$ are mutually consistent; in 21% of MC
experiments they are further apart. Nevertheless we include half of $m_t^{\rm LB} - m_t^{\rm NN}$ in the
systematic error. To obtain our result, shown in Table I, we combine $m_t^{\rm LB}$ and $m_t^{\rm NN}$ allowing
for their (88 ± 4)% correlation (determined by MC experiments). Figures 3(a-b) show that
this result represents the data well. From the MC experiments summarized in Table I we
measure the interval $\pm\delta m$ within which 68% of the MC estimates fall. For the full ensemble,
$\delta m$ is larger than $\sigma_m$ from our data. However, for "accurate subsets" of the ensemble for
which the average $\sigma_m/m_t$ is the same as we observe, $\delta m$ is close to $\sigma_m$ [10].
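The text does not spell out how the two correlated results are combined. One standard choice, assumed here purely for illustration, is the minimum-variance linear combination of two correlated estimates. With the rounded Table I inputs (174.0 ± 5.6, 171.3 ± 6.0, correlation 0.88) it gives roughly 173.4 ± 5.6, close to the quoted 173.3 ± 5.6.

```python
import math


def combine_correlated(m1, s1, m2, s2, rho):
    """Minimum-variance weighted mean of two estimates with correlation rho."""
    cov = rho * s1 * s2
    denom = s1 * s1 + s2 * s2 - 2.0 * cov
    w1 = (s2 * s2 - cov) / denom  # weight of the first estimate
    w2 = 1.0 - w1
    mean = w1 * m1 + w2 * m2
    var = w1 * w1 * s1 * s1 + w2 * w2 * s2 * s2 + 2.0 * w1 * w2 * cov
    return mean, math.sqrt(var)


if __name__ == "__main__":
    # Rounded LB and NN inputs from Table I; rho from the MC experiments.
    print(combine_correlated(174.0, 5.6, 171.3, 6.0, 0.88))
```

Because the correlation is high, the combined error is barely smaller than the better of the two inputs, which is consistent with the statistical error quoted for the final result.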

A principal systematic error in $m_t$ arises from uncertainty in the jet energy scale, which
is calibrated in three steps. In step 1, applied before events are selected, the summed energy
$E_{\rm jet}$ of particles emitted within the jet cone is related [11] to the measured energy $E_m$ by
$E_{\rm jet} = (E_m - O)/[R(1 - S)]$. Here the calorimeter response $R$ is calibrated using $Z \to ee$
decays and $E_T$ balance in $\gamma$+jet events, the fractional shower leakage $S$ out of the jet cone
is set by test beam data, and the energy offset $O$ due to noise and the underlying event is
determined using events with multiple interactions. Steps 2 and 3 are applied only to jet
energies used to find $m_{\rm fit}$. In step 2, top MC is used to correct $E_{\rm jet}$ to the parton energy
in both data and MC. This sharpens the resolution in $m_{\rm fit}$. Step 3 is a final adjustment
based on more detailed study of $\gamma$+jet events in data and MC, particularly focused on the
dependence of the $E_T$ balance upon $\eta$ of the jet. We assign a jet-scale error of ±(2.5% + 0.5
GeV) based on the internal consistency of step 3, on variations of the $\gamma$+jet cuts and the
model for the underlying event, and on an independent check of the $E_T$ balance in $Z$+jet
events. This leads to an error on $m_t$ of ±4.0 GeV/$c^2$.
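Step 1 of the calibration and the assigned jet-scale error are simple enough to write out. In this sketch the function names are hypothetical, the leakage factor is read as multiplying the response in the denominator, and the numerical inputs in the usage lines are invented for illustration rather than taken from the detector calibration.

```python
def correct_jet_energy(e_meas, offset, response, leakage):
    """Step 1 correction: E_jet = (E_m - O) / (R * (1 - S)), energies in GeV."""
    return (e_meas - offset) / (response * (1.0 - leakage))


def jet_scale_uncertainty(e_jet):
    """Assigned jet-scale error of 2.5% of the jet energy plus 0.5 GeV."""
    return 0.025 * e_jet + 0.5


if __name__ == "__main__":
    # Illustrative inputs only: 50 GeV measured, 2 GeV offset,
    # response 0.8, 4% shower leakage out of the cone.
    e_jet = correct_jet_energy(50.0, 2.0, 0.8, 0.04)
    print(e_jet, jet_scale_uncertainty(e_jet))
```

Shifting every jet energy up and down by `jet_scale_uncertainty` and repeating the mass fit is one common way such a scale error is propagated to a mass measurement.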

We estimate the uncertainties in modeling of QCD by substituting the ISAJET MC generator
[12] for HERWIG, independently for top MC and for VECBOS fragmentation, and by
changing the VECBOS QCD scale from jet $\langle p_T \rangle^2$ to $M_W^2$. The resulting systematic error due
to the generator is ±4.1 GeV/$c^2$. Other effects including noise, multiple $p\bar{p}$ interactions,
and differences in fits to $\ln L$ contribute ±2.2 GeV/$c^2$. All systematic errors (Table I) sum
in quadrature to ±6.2 GeV/$c^2$. Therefore our direct measurement of the top quark mass is
$m_t = 173.3 \pm 5.6$ (stat) $\pm 6.2$ (syst) GeV/$c^2$.
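The quadrature sum can be reproduced from the rounded components. With the three rounded inputs the result is about 6.1 GeV/$c^2$, slightly below the quoted 6.2; the difference may simply reflect rounding of the individual components, which is an assumption rather than something the text states. The helper name is hypothetical.

```python
import math


def quadrature(*errors):
    """Combine independent errors as the square root of the sum of squares."""
    return math.sqrt(sum(e * e for e in errors))


# Rounded systematic components in GeV/c^2: energy scale, generator, other.
syst = quadrature(4.0, 4.1, 2.2)

# Quoted statistical and systematic errors combined into a total error.
total = quadrature(5.6, 6.2)

if __name__ == "__main__":
    print(round(syst, 2), round(total, 2))
```

The total of about 8.4 GeV/$c^2$ is only an illustrative combination; the text quotes the statistical and systematic errors separately.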

We thank the staffs at Fermilab and the collaborating institutions for their contributions 